9 research outputs found

    Towards the text compression based feature extraction in high impedance fault detection

    High impedance faults of medium voltage overhead lines with covered conductors can be identified by the presence of partial discharges. Although this has been a subject of research for more than 60 years, online partial discharge detection remains a challenge, especially in environments with heavy background noise. In this paper, a new approach to partial discharge pattern recognition is presented. All results were obtained on data acquired from a real 22 kV medium voltage overhead power line with covered conductors. The proposed method is based on a text compression algorithm and serves as a signal similarity estimate, applied for the first time to partial discharge patterns. Its relevance is examined using three different variations of a classification model. The improvement gained on an already deployed model demonstrates its quality.
    Web of Science, 1211, art. no. 214

    Utilization of Entropy in Text Similarity (Využití entropie v textové podobnosti)

    In our computerized world, computers and users produce an enormous quantity of new data every day. One of the most challenging problems of modern informatics and computer science is detecting similarities and differences between large numbers of documents. The presented dissertation focuses on the use of entropy in text similarity. Text similarity can be measured by compression-based similarity metrics. Their application is shown in three areas. The first area deals with spam detection, where an incoming e-mail is classified into one of two classes: solicited, or unsolicited (spam). This classification can be done by a Bayesian spam filter. The filter is extended with the Normalized Compression Distance and e-mail signatures. This combination gives better results than a standalone Bayesian spam filter. The second area of interest is plagiarism detection. Many different types of documents are produced today, such as reports and theses in the school environment. The retrieval and extraction of reused text from large document collections are important to applications such as plagiarism detection, copyright protection, and information flow analysis. To address these issues, the thesis presents algorithms that can detect similar (plagiarized) documents. The proposed method is also inspired by data compression, but in a different way: it uses only some initialization parts of the compression algorithm and their modifications. The last part shows how electroencephalography (EEG) data can be processed as text documents. First, the data has to be converted from measured voltages into text codes. The described conversion is performed using turtle graphics, and the result is coded into text. After such a conversion, the EEG data can be processed and classified by a compression-based similarity metric. This transformation of EEG data is applicable to the detection of simple cognitive tasks, for example, finger movements.
    Import 02/11/2016; 460 - Department of Computer Science (Katedra informatiky); vyhově
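The Normalized Compression Distance mentioned in the abstract above can be illustrated with a short sketch. This is not the thesis's implementation: it assumes zlib as the compressor, and the function names `compressed_size` and `ncd` are chosen here for illustration. It uses the commonly cited form NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings.

    Values near 0 suggest the inputs share most of their structure;
    values near 1 suggest they share little.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    a = b"the quick brown fox jumps over the lazy dog " * 20
    b = b"lorem ipsum dolor sit amet consectetur adipiscing elit " * 20
    print("same text:     ", ncd(a, a))
    print("different text:", ncd(a, b))
```

With a real compressor the distance between identical inputs is small but usually not exactly zero, so results are best compared relative to one another rather than against fixed thresholds.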

    Menu Generation for PowerPoint 2007 (Generování menu pro PowerPoint 2007)

    Import 22/10/2008; In person (Prezenční); 456 - Department of Computer Science (Katedra informatiky); Not stated (Neuveden)

    Geographic Community Portal Using Adobe Air

    The thesis describes the Adobe AIR technology and its basic API functions for creating a user interface, accessing the local file system, communicating over the network, and working with media. It also describes the tools for debugging and for building the installation package. The capabilities of Adobe AIR are demonstrated on a simple geographic portal, which contains points of interest, GPS tracks, and photos organized into categories.
    Import 29/09/2010; In person (Prezenční); 456 - Department of Computer Science (Katedra informatiky); velmi dobř

    Medical image retrieval using vector quantization and fuzzy S-tree

    The aim of the article is to present a novel method for fuzzy medical image retrieval (FMIR) using vector quantization (VQ) with fuzzy signatures in conjunction with fuzzy S-trees. In the past, searching for similar pictures was based on the picture name rather than on similar content (e.g. shapes, colour). Some methods for this purpose exist, but there is still room for developing more efficient ones. The proposed image retrieval system is used for finding similar images, in our case in the medical area of mammography, and for creating a list of similar images (cases). The created list is used for assessing the nature of the finding, that is, whether the medical finding is malignant or benign. The suggested method is compared to a method that uses Normalized Compression Distance (NCD) instead of fuzzy signatures and a fuzzy S-tree. The method with NCD is useful for creating the list of similar cases for malignancy assessment, but it is not able to capture the area of interest in the image. The proposed method is going to be added to a complex decision support system to help determine appropriate healthcare based on experience with similar previous cases.
    Web of Science, 412, art. no. 1

    Spam detection using data compression and signatures

    In this article, we introduce a novel method for spam detection based on a combination of Bayesian filtering, signature trees, and data compression-based similarity. Bayesian filtering is one of the most popular and most efficient algorithms for spam detection. Its weakness is that it cannot classify every e-mail with certainty, and spam e-mails are sometimes classified as regular e-mails. The novel method addresses this problem by using signature trees and data compression-based similarity. The main result of this article is an improvement of up to 99% in spam detection precision using this novel method.
    Web of Science, 446-754953

    Graph visualisation by concurrent differential evolution

    A representative dimensionality reduction is an important step in the analysis of real-world data. Vast amounts of raw data are generated by cyberphysical and information systems in different domains. They often feature a combination of high dimensionality, large volume, and vague, loosely defined structure. The main goal of visual data analysis is an intuitive, comprehensible, efficient, and graphically appealing representation of information and knowledge that can be found in such collections. To achieve an efficient visualisation, raw data need to be transformed into a refined form suitable for machine and human analysis. Various methods of dimension reduction and projection to low-dimensional spaces are used to accomplish this task. Sammon's projection is a well-known non-linear projection algorithm valued for its ability to preserve dependencies from an original high-dimensional data space in the low-dimensional projection space. Recently, it has been shown that bio-inspired real-parameter optimization methods can be used to implement Sammon's projection on data from the domain of social networks. This work investigates the ability of several advanced types of the differential evolution algorithm, as well as their parallel variants, to minimize the error function of Sammon's projection and compares their results and performance to a traditional heuristic algorithm.
    Web of Science, 25438636
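The error function of Sammon's projection referred to in the abstract above is commonly written as Sammon's stress, E = (1 / Σ d*_ij) · Σ (d*_ij - d_ij)² / d*_ij, summed over all pairs i < j, where d*_ij is the distance in the original space and d_ij the distance in the projection. The sketch below evaluates that objective only; it assumes Euclidean distances, the function name `sammon_stress` is hypothetical, and it does not reproduce the paper's differential evolution optimizers.

```python
import math
from itertools import combinations
from typing import Sequence


def sammon_stress(high: Sequence[Sequence[float]],
                  low: Sequence[Sequence[float]]) -> float:
    """Sammon's stress between original points and their low-dimensional images.

    Assumes Euclidean distance in both spaces. Pairs that coincide in the
    original space are skipped to avoid division by zero (an assumption of
    this sketch, not something stated in the abstract).
    """
    if len(high) != len(low):
        raise ValueError("both point sets must have the same number of points")

    total = 0.0   # sum of original pairwise distances
    error = 0.0   # weighted squared distance differences
    for i, j in combinations(range(len(high)), 2):
        d_star = math.dist(high[i], high[j])
        if d_star == 0.0:
            continue
        d = math.dist(low[i], low[j])
        total += d_star
        error += (d_star - d) ** 2 / d_star

    if total == 0.0:
        raise ValueError("all original points coincide; stress is undefined")
    return error / total


if __name__ == "__main__":
    # Three points lying in the plane z = 0, projected by dropping z.
    original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    projected = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    print(sammon_stress(original, projected))
```

An optimizer such as differential evolution would treat the coordinates in `low` as the parameter vector and this function as the fitness to minimize.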

    Medical Image Retrieval Using Vector Quantization and Fuzzy S-tree
